2.3. ResNet and ResNet_vd Series

2.3.1. Overview

The ResNet series was proposed in 2015 and won the ILSVRC2015 competition with a top-5 error rate of 3.57%. Its key innovation is the residual structure: the ResNet network is built by stacking multiple residual blocks. Experiments show that residual blocks effectively improve both convergence speed and accuracy.
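
For reference, the sketch below shows a minimal bottleneck residual block with the identity shortcut y = F(x) + x, written against the PaddlePaddle API. The class name BottleneckBlock, the channel sizes, and the layer arrangement are illustrative assumptions for this overview, not the actual PaddleClas implementation.

```python
import paddle
import paddle.nn as nn
import paddle.nn.functional as F

class BottleneckBlock(nn.Layer):
    """Illustrative 1x1 -> 3x3 -> 1x1 bottleneck with a shortcut (y = F(x) + x)."""

    def __init__(self, in_channels, mid_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2D(in_channels, mid_channels, 1, bias_attr=False)
        self.bn1 = nn.BatchNorm2D(mid_channels)
        self.conv2 = nn.Conv2D(mid_channels, mid_channels, 3, stride=stride,
                               padding=1, bias_attr=False)
        self.bn2 = nn.BatchNorm2D(mid_channels)
        self.conv3 = nn.Conv2D(mid_channels, out_channels, 1, bias_attr=False)
        self.bn3 = nn.BatchNorm2D(out_channels)
        # Projection shortcut when the shape changes; identity otherwise.
        self.shortcut = None
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2D(in_channels, out_channels, 1, stride=stride, bias_attr=False),
                nn.BatchNorm2D(out_channels),
            )

    def forward(self, x):
        identity = x if self.shortcut is None else self.shortcut(x)
        y = F.relu(self.bn1(self.conv1(x)))
        y = F.relu(self.bn2(self.conv2(y)))
        y = self.bn3(self.conv3(y))
        # The residual addition lets gradients flow directly through the skip path.
        return F.relu(y + identity)

x = paddle.randn([1, 64, 56, 56])
block = BottleneckBlock(64, 64, 256)
print(block(x).shape)  # [1, 256, 56, 56]
```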

Joyce Xu of Stanford University calls ResNet one of three architectures that “really redefine the way we think about neural networks.” Owing to ResNet's outstanding performance, many scholars and engineers from academia and industry have improved on its structure; well-known variants include Wide-ResNet, ResNet-vc, ResNet-vd, and Res2Net. Because the parameter counts and FLOPs of ResNet-vc and ResNet-vd are almost the same as those of ResNet, we group them into the ResNet series here.

The ResNet series released this time includes 14 pre-trained models, among them ResNet50, ResNet50_vd, ResNet50_vd_ssld, and ResNet200_vd. At the training level, ResNet follows the standard ImageNet training procedure, while the improved models adopt additional training strategies: cosine decay for the learning-rate schedule, label smoothing as regularization, mixup added to data preprocessing (all sketched below), and an increase in the total number of iterations from 120 epochs to 200 epochs.
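
To make these strategies concrete, here is a minimal, framework-agnostic sketch in Python/NumPy of the three ingredients: cosine learning-rate decay, label smoothing, and mixup. The function names and default hyperparameters (e.g. epsilon=0.1, alpha=0.2) are illustrative assumptions, not the exact PaddleClas configuration.

```python
import numpy as np

def cosine_decay_lr(base_lr, epoch, total_epochs):
    """Cosine learning-rate decay: lr falls from base_lr toward 0 over total_epochs."""
    return 0.5 * base_lr * (1 + np.cos(np.pi * epoch / total_epochs))

def smooth_labels(one_hot, epsilon=0.1):
    """Label smoothing: mix the one-hot target with a uniform distribution."""
    num_classes = one_hot.shape[-1]
    return one_hot * (1 - epsilon) + epsilon / num_classes

def mixup(images, one_hot, alpha=0.2):
    """Mixup: blend pairs of samples and their targets with a Beta-sampled weight."""
    lam = np.random.beta(alpha, alpha)
    idx = np.random.permutation(len(images))
    mixed_images = lam * images + (1 - lam) * images[idx]
    mixed_labels = lam * one_hot + (1 - lam) * one_hot[idx]
    return mixed_images, mixed_labels

# Example: learning rate at epoch 100 of a 200-epoch schedule with base_lr = 0.1
print(cosine_decay_lr(0.1, 100, 200))  # 0.05
```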

Among them, ResNet50_vd_v2 and ResNet50_vd_ssld use knowledge distillation, which further improves accuracy while keeping the structure unchanged. Specifically, the teacher model of ResNet50_vd_v2 is ResNet152_vd (top-1 accuracy 80.59%) and its training set is ImageNet-1k; the teacher model of ResNet50_vd_ssld is ResNeXt101_32x16d_wsl (top-1 accuracy 84.2%) and its training set combines 4 million images mined from ImageNet-22k with ImageNet-1k. The specific knowledge distillation methods are being continuously updated.
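
As a rough illustration of the idea only (not the exact distillation recipe used for these models), the following NumPy sketch shows a generic soft-label distillation loss in which the student is trained to match the teacher's softened output distribution. The temperature value, batch shapes, and function names are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened distribution and the student's.

    The student learns to match the soft labels of a larger teacher
    (e.g. ResNet152_vd or ResNeXt101_32x16d_wsl) without changing its own architecture.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -np.mean(np.sum(p_teacher * np.log(p_student + 1e-12), axis=-1))

student = np.random.randn(8, 1000)   # student (e.g. ResNet50_vd) logits on a batch
teacher = np.random.randn(8, 1000)   # teacher logits on the same batch
print(distillation_loss(student, teacher))
```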

The FLOPs, parameter counts, and inference time on a T4 GPU for this series of models are shown in the figure below.

As the curves show, deeper networks achieve higher accuracy, but their parameter counts, computation, and latency increase accordingly. ResNet50_vd_ssld further improves top-1 accuracy on the ImageNet-1k validation set to 82.39% by using a stronger teacher and more data, setting a new record for ResNet50-series models.


